Video coding

ABSTRACT

A method and apparatus for encoding an input video bitstream to produce an encoded output bitstream is disclosed. A base stream is enhanced based on enhancement control parameters. At least one picture content parameter is extracted from the enhanced base stream. At least one picture content parameter is extracted from the input video bitstream. The enhanced picture content parameters are compared with the input picture content parameters. An output from the comparison step is received and the enhancement control parameters are calculated so as to minimize the difference between the input picture content parameters and the enhanced picture content parameters. The calculated control parameters are incorporated into the encoded output bitstream.

FIELD OF THE INVENTION

The invention relates to video coding, and more particularly to spatial scalable video compression schemes.

BACKGROUND OF THE INVENTION

Because of the massive amounts of data inherent in digital video, the transmission of full-motion, high-definition digital video signals is a significant problem in the development of high-definition television. More particularly, each digital image frame is a still image formed from an array of pixels according to the display resolution of a particular system. As a result, the amounts of raw digital information included in high-resolution video sequences are massive. In order to reduce the amount of data that must be sent, compression schemes are used to compress the data. Various video compression standards or processes have been established, including MPEG-2, MPEG-4, and H.264.

Many applications are enabled where video is available at various resolutions and/or qualities in one stream. Methods to accomplish this are loosely referred to as scalability techniques. There are three axes on which one can deploy scalability. The first is scalability on the time axis, often referred to as temporal scalability. Secondly, there is scalability on the quality axis, often referred to as signal-to-noise scalability or fine-grain scalability. The third axis is the resolution axis (number of pixels in the image), often referred to as spatial scalability or layered coding. In layered coding, the bitstream is divided into two or more bitstreams, or layers. The layers can be combined to form a single high quality signal. For example, the base layer may provide a lower quality video signal, while the enhancement layer provides additional information that can enhance the base layer image.

In particular, spatial scalability can provide compatibility between different video standards or decoder capabilities. With spatial scalability, the base layer video may have a lower resolution than the input video sequence, in which case the enhancement layer carries information which can restore the resolution of the base layer to the input sequence level.

FIG. 1 illustrates a known layered video encoder 100. The depicted encoding system 100 accomplishes layer compression, whereby a portion of the channel is used for providing a low resolution base layer and the remaining portion is used for transmitting edge enhancement information, whereby the two signals may be recombined to bring the system up to high-resolution. The high resolution video input Hi-RES is split by splitter 102 whereby the data is sent to a low pass filter 104 and a subtraction circuit 106. The low pass filter 104 reduces the resolution of the video data, which is then fed to a base encoder 108. In general, low pass filters and encoders are well known in the art and are not described in detail herein for purposes of simplicity. The encoder 108 produces a lower resolution base stream which is provided to a second splitter 110 from where it is output from the system 100. The base stream can be broadcast, received and, via a decoder, displayed as is, although the base stream does not provide a resolution which would be considered as high-definition.

The other output of the splitter 110 is fed to a decoder 112 within the system 100. From there, the decoded signal is fed into an interpolate and upsample circuit 114. In general, the interpolate and upsample circuit 114 reconstructs the filtered out resolution from the decoded video stream and provides a video data stream having the same resolution as the high-resolution input. However, because of the filtering and the losses resulting from the encoding and decoding, certain errors are present in the reconstructed stream. These errors are determined in the subtraction circuit 106 by subtracting the reconstructed high-resolution stream from the original, unmodified high-resolution stream. The output of the subtraction circuit 106 is fed to an enhancement encoder 116 which outputs a reasonable quality enhancement stream.
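The base/residual split performed by the known encoder of FIG. 1 can be illustrated with a minimal sketch. It is not the encoder itself: it works on a one-dimensional row of pixels, stands in pair averaging for the low pass filter 104, stands in sample repetition for the interpolate and upsample circuit 114, and omits the lossy base encoder 108 and decoder 112 entirely. All function names are hypothetical.

```python
def downsample(signal):
    """Average adjacent pairs: a crude low-pass filter plus 2:1 decimation (assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def upsample(base):
    """Repeat each sample twice to return to the input resolution (assumed)."""
    return [value for value in base for _ in range(2)]


def residual(original, reconstructed):
    """Role of subtraction circuit 106: original minus reconstructed."""
    return [o - r for o, r in zip(original, reconstructed)]


hi_res = [10, 12, 40, 44, 90, 70, 20, 20]
base = downsample(hi_res)            # low resolution base layer
recon = upsample(base)               # back at the input resolution, but softened
res = residual(hi_res, recon)        # what the enhancement layer would carry
restored = [r + e for r, e in zip(recon, res)]
```

Adding the residual back to the upsampled base recovers the input exactly in this sketch, only because no lossy coding sits between the stages.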

The disadvantage of filtering and downscaling the input video to a lower resolution and then compressing it is that the video loses sharpness. This can to a certain degree be compensated for by using sharpness enhancement after the decoder. Picture enhancement techniques normally are controlled by analyzing the enhanced output signal. If the original full resolution signal is used as a reference, the enhancement control can be improved. Normally, however, such a reference is not present, for example in television sets. In some applications, e.g., spatial scalable compression, such a reference signal is present. The problem then becomes how to make use of this reference. One possibility is to look at the pixel difference between the reference and the enhanced output signal. Control can be achieved by minimizing the difference energy. However, this method does not really take into account how the human eye perceives a picture as sharp. It is known that picture content parameters can be extracted from a picture which take into account how the human eye perceives a picture as sharp. Here the control algorithm tries to maximize these values, with the danger of overdoing it, resulting in sharp but not quite natural pictures. The problem is how to use these extracted picture content parameters when there is also a reference picture available to control picture enhancement.

SUMMARY OF THE INVENTION

The invention overcomes the deficiencies of other known layered compression schemes by using picture content parameters for both the enhanced output signal and the reference signal. A control algorithm controls the enhancement of the base stream in such a manner that the difference between the picture content parameters of the enhanced output signal and the reference signal becomes as low as possible. This prevents the enhancement from being overdone and results in sharp natural pictures.

According to one embodiment of the invention, a method and apparatus for encoding an input video bitstream to produce an encoded output bitstream is disclosed. A base stream is enhanced based on enhancement control parameters. At least one picture content parameter is extracted from the enhanced base stream. At least one picture content parameter is extracted from the input video bitstream. The enhanced picture content parameters are compared with the input picture content parameters. An output from the comparison step is received and the enhancement control parameters are calculated so as to minimize the difference between the input picture content parameters and the enhanced picture content parameters. The calculated control parameters are incorporated into the encoded output bitstream.

These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereafter.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described, by way of example, with reference to the accompanying drawings, wherein:

FIG. 1 is a block diagram representing a known layered video encoder;

FIG. 2 is a block diagram of a layered video encoder/decoder according to one embodiment of the invention;

FIGS. 3a-3b illustrate DCT coefficient energy level curves according to one embodiment of the invention;

FIG. 4 is a block diagram of a decoder according to one embodiment of the invention;

FIG. 5 is a block diagram of a decoder according to another embodiment of the invention; and

FIG. 6 is a block diagram of a decoder according to another embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

According to one embodiment of the invention, a spatial scalable compression scheme using spatial sharpness enhancement techniques is disclosed. In this embodiment of the invention, picture content information is extracted from both the reference signal and the enhanced output signal as will be described below.

This embodiment will now be described in more detail with reference to FIG. 2 which is a block diagram of an encoder which can be used with the invention. It will be understood that the encoder can be a layered encoder with a base layer having a relatively low resolution and at least one enhancement layer, but the invention is not limited thereto. The depicted encoding system 200 accomplishes layer compression, whereby a portion of the channel is used for providing a low resolution base layer and the remaining portion is used for transmitting edge enhancement information, whereby the two signals may be recombined to bring the system up to high-resolution. The high resolution video input 201 is split by a splitter 210 whereby the data is sent to a low pass filter 212, for example a Nyquist filter, and a splitter 232. The low pass filter 212 reduces the resolution of the video data, which is then fed to a base encoder 214. In general, low pass filters and encoders are well known in the art and are not described in detail herein for purposes of simplicity. The base encoder 214 produces a lower resolution base stream 215. The base stream can be broadcast, received and, via a decoder, displayed as is, although the base stream does not provide a resolution which would be considered as high-definition.

The encoder also outputs a decoded base stream to a splitter 213 which splits the decoded base stream and supplies it to an upscaling circuit 216 and an enhancement unit 220. In addition, a decoder (not illustrated) can be inserted into the circuit after the encoder 214 to decode the output of the encoder prior to being sent to the upscaling circuit 216. In general, the upscaling circuit 216 reconstructs the filtered out resolution from the decoded video stream and provides a video data stream having the same resolution as the high-resolution input. The upscaled bitstream v1 from the upscaling circuit 216 is inputted into a subtraction circuit 234.

The enhancement unit 220 processes the output signal 215 and enhances the signal according to the enhancement algorithm(s) in the enhancement unit 220 and enhancement control parameters (“enh ctrl par”) produced by a control unit 231. Many video enhancement techniques exist and they all modify the picture content such that the appreciation of the resulting picture is improved. The subjective character of these enhancements complicates the optimization process and is likely the reason for the diversity in video enhancement algorithms. Various enhancement algorithms contribute by some means to the picture quality. Noise reduction and sharpness improvement algorithms are just a few examples out of a large set of enhancement algorithms. It will be understood that any of these known enhancement algorithms can be used in the invention.

The enhanced output signal 221 is provided to a picture content parameter unit 222. The picture content parameter unit 222 extracts a plurality of picture content parameters from the enhanced output signal 221. In this illustrative example three picture content parameters are extracted from the enhanced output signal 221, but the invention is not limited thereto.

The reference signal 201 is provided to a picture content parameter unit 224. The picture content parameter unit 224 extracts the same plurality of picture content parameters from the reference signal 201 as the picture content parameter unit 222 extracts from the enhanced output signal 221. The picture content parameters can be computed globally per frame, but can also be computed per group of pixels, e.g., 16*16 pixels. Examples of picture content parameters extracted from a picture or group of pixels include, but are not limited to: the difference between the maximum and minimum value of a group of pixels; the edge steepness value at the center of edges; DCT coefficient high frequency energy levels, etc. FIG. 3a illustrates a DCT coefficient energy level curve of the reference signal 201, and FIG. 3b illustrates a DCT coefficient energy level curve of the enhanced output signal 221.
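The kinds of picture content parameters listed above can be illustrated with a minimal sketch. It makes several assumptions not found in the description: it works on a single row of pixels rather than a 16*16 group, it uses the largest step between neighbouring pixels as a simple stand-in for edge steepness, and it defines high frequency energy as the sum of squared coefficients in the upper half of an unnormalized DCT-II. All function names are hypothetical.

```python
import math


def dynamic_range(pixels):
    """Difference between the maximum and minimum value of a group of pixels."""
    return max(pixels) - min(pixels)


def max_edge_steepness(pixels):
    """Largest step between neighbouring pixels (assumed stand-in for edge steepness)."""
    return max(abs(b - a) for a, b in zip(pixels, pixels[1:]))


def dct_ii(pixels):
    """Unnormalized 1-D DCT-II of a row of pixels."""
    n = len(pixels)
    return [
        sum(p * math.cos(math.pi * (i + 0.5) * k / n) for i, p in enumerate(pixels))
        for k in range(n)
    ]


def high_freq_energy(pixels):
    """Sum of squared DCT coefficients in the upper half of the spectrum (assumed split)."""
    coeffs = dct_ii(pixels)
    return sum(c * c for c in coeffs[len(coeffs) // 2:])


def content_parameters(pixels):
    """Three parameters per group of pixels, mirroring the three-parameter example."""
    return (dynamic_range(pixels), max_edge_steepness(pixels), high_freq_energy(pixels))


sharp = [0, 0, 0, 0, 100, 100, 100, 100]   # abrupt edge
soft = [0, 14, 29, 43, 57, 71, 86, 100]    # same range, gradual transition
sharp_params = content_parameters(sharp)
soft_params = content_parameters(soft)
```

Both rows span the same dynamic range, yet the abrupt edge scores higher on steepness and on high frequency energy. This is the kind of distinction a pixel-difference measure alone does not express in perceptual terms.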

The extracted picture content parameters from the reference picture content parameter unit 224 and the enhanced picture content parameter unit 222 are supplied to a comparison unit comprising, for example, at least one subtraction unit 226 and multiplication units 228. It will be understood by those skilled in the art that the comparison unit can be comprised of other elements as well. The subtraction units 226 subtract the enhanced picture content parameters from the reference picture content parameters. The output of each subtraction unit 226 can optionally be supplied to the multiplication units 228, which multiply the outputs by predetermined factors (C1, C2, C3). The outputs of the multiplication units are summed together in a summation unit 230 and supplied to the control unit 231. The control unit 231 processes the information received from the summation unit 230 and produces new enhancement control parameters. According to one embodiment of the invention, the control unit 231 controls the enhancement unit 220 via the enhancement control parameters so that the difference between the picture content parameters of the reference signal and the enhanced output signal becomes as low as possible. This also prevents the enhancement from being overdone, which normally results in sharp but not quite natural pictures.
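The comparison and control loop can be sketched as follows. This is a toy illustration under stated assumptions, not the control algorithm of the control unit 231, which the description leaves open. The peaking filter, the two content parameters, the candidate gains and the use of the absolute value of the summed error as the quantity to minimize are all hypothetical choices.

```python
def extract(pixels):
    """Two toy content parameters: dynamic range and largest neighbouring step."""
    steps = [abs(b - a) for a, b in zip(pixels, pixels[1:])]
    return (max(pixels) - min(pixels), max(steps))


def enhance(pixels, gain):
    """Toy peaking filter: push each pixel away from the mean of its neighbours."""
    out = []
    for i, p in enumerate(pixels):
        left = pixels[max(i - 1, 0)]
        right = pixels[min(i + 1, len(pixels) - 1)]
        out.append(p + gain * (p - (left + right) / 2))
    return out


def weighted_error(ref_params, enh_params, weights):
    """Subtract (226), scale by the factors C1..Cn (228), and sum (230)."""
    return sum(c * (r - e) for c, r, e in zip(weights, ref_params, enh_params))


def pick_gain(reference, base, weights, candidates):
    """Choose the candidate gain whose summed error is closest to zero (assumed rule)."""
    ref_params = extract(reference)
    return min(
        candidates,
        key=lambda g: abs(weighted_error(ref_params, extract(enhance(base, g)), weights)),
    )


reference = [0, 0, 0, 100, 100, 100]   # full resolution reference
base = [0, 0, 25, 75, 100, 100]        # softened base stream
best = pick_gain(reference, base, weights=(1.0, 1.0), candidates=[0.0, 0.5, 1.0, 1.5, 2.0])
```

In this example the search settles on a gain of 1.0, whereas a controller that simply maximized steepness would pick the largest candidate. Because the differences are signed, an overshoot in one parameter can offset an undershoot in another; the factors C1..Cn are where that trade-off would be tuned.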

The upscaled output of the upscaling circuit 216 is subtracted from the original input 201 in a subtraction circuit 234 to produce a residual bitstream which is applied to a switch 236. The switch is controlled by the output (S) of the control unit 231. By comparing the input video bitstream 201 with the enhanced base video stream, the control unit 231 can determine which pixels or groups of pixels (blocks) need to be further enhanced by the enhancement layer 208. For the pixels or groups of pixels (blocks) that are determined to need enhancement by the control unit 231, the control unit 231 outputs the control signal (S) to close switch 236 to let those parts of the residual bitstream through to the enhancement layer encoder 240. The control unit 231 also sends the selected enhancement control parameters and the control signal for the switch 236 to the encoder 240 so that this information is incorporated (multiplexed) with the resulting residual bitstream in the enhancement stream 241.
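The gating of the residual by the switch 236 can be sketched per block. The decision rule here is an assumption: the description does not state how the control unit 231 decides that a block needs further enhancement, so the sketch uses a hypothetical threshold on a per-block error value. The function names and data are illustrative only.

```python
def switch_signal(block_errors, threshold):
    """Per-block control signal S: True closes the switch for that block (assumed rule)."""
    return [abs(err) > threshold for err in block_errors]


def gate_residual(residual_blocks, s):
    """Pass only the residual blocks whose switch is closed, tagged with their index."""
    return [
        (i, blk)
        for i, (blk, closed) in enumerate(zip(residual_blocks, s))
        if closed
    ]


# Hypothetical per-block errors left after enhancement, and the matching residual blocks.
block_errors = [0.5, 12.0, -30.0, 2.0]
residual_blocks = [[0, 1], [5, -6], [20, -18], [1, 0]]

s = switch_signal(block_errors, threshold=10.0)
passed = gate_residual(residual_blocks, s)   # blocks forwarded to the enhancement encoder
```

Only the two blocks that enhancement alone could not bring close to the reference are forwarded, which is how the scheme avoids spending enhancement-layer bits where they are not needed.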

FIG. 4 illustrates a decoder 400 which can be used to decode the base and enhancement streams from the encoder 200 according to one embodiment of the invention. In this embodiment, the base stream 215 is decoded by a base decoder 402 and the enhancement stream 241 is decoded by an enhancement decoder 404. The decoded base stream is supplied to an upconverter 406 and an enhancement unit 408. The decoded enhancement stream is supplied to an addition unit 410. The addition unit 410 adds the decoded enhancement stream to the upconverted base stream from the upconverter 406 and provides the combined stream to one side of a switch 414.

The enhancement decoder 404 also removes the signal S and the enhancement control parameters from the enhancement stream via a demultiplexer (not illustrated) and provides the signal S and the enhancement control parameters to an enhancement control unit 412. The enhancement control unit 412 provides the signal S to the switch 414 and the enhancement control parameters to the enhancement unit 408. The enhancement unit 408 enhances the decoded base stream according to the enhancement algorithms in the enhancement unit 408 and the enhancement control parameters provided by the enhancement control unit 412. The enhanced base stream is then provided to the other side of the switch 414. Depending on the position of the switch as determined by the signal S, the decoder 400 outputs either the combined stream from the addition unit 410 or the enhanced base stream.

According to another embodiment of the invention, the output of the decoder 400 may be a combination of the combined stream from the addition unit 410 and the enhanced base stream from the enhancement unit 408. As illustrated in FIG. 5, the signal S is provided to a pair of multiplication units 502 and 504, where S is a value between 0 and 1. In this illustrative example, the multiplication unit 502 multiplies the combined stream from the addition unit 410 by the value of (1-S). The multiplication unit 504 multiplies the enhanced base stream by the value S. The outputs of the two multiplication units are combined in the addition unit 506 to form the output of the decoder.
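The weighted combination of FIG. 5 amounts to output = (1-S) * combined + S * enhanced for each sample. A minimal sketch follows; the function name and the sample values are hypothetical.

```python
def blend(combined, enhanced, s):
    """Mix the two decoder paths: units 502 and 504 scale, unit 506 adds."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("S must lie between 0 and 1")
    return [(1.0 - s) * c + s * e for c, e in zip(combined, enhanced)]


combined = [100.0, 120.0, 80.0]   # upconverted base plus decoded enhancement stream
enhanced = [60.0, 100.0, 120.0]   # decoded base stream after the enhancement unit

mixed = blend(combined, enhanced, 0.25)
```

At the endpoints the blend reduces to selecting one stream or the other: S = 0 yields the combined stream and S = 1 yields the enhanced base stream.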

In another embodiment of the invention, the output of the enhancement encoder section of the encoder 200 can be muted out by the control unit 231 or some other device. As a result, there is no enhancement stream outputted from the encoder 200. In this illustrative example, the enhancement control parameters are created as described above, but are provided to the base encoder 214 via the dashed line 251 in FIG. 2. The enhancement control parameters are then incorporated into the encoded base stream 215 via a multiplexer in the base encoder.

The encoded base stream 215 with the incorporated enhancement control parameters can then be decoded by the decoder 600 illustrated in FIG. 6. The encoded base stream is decoded in the base decoder 602 and the decoded base stream is provided to an enhancement unit 604. The base decoder 602 also separates the enhancement control parameters from the encoded base stream 215 and supplies them to an enhancement control unit 606. The decoded base stream is then enhanced by the enhancement unit 604 according to the enhancement algorithms in the enhancement unit 604 and the enhancement control parameters from the enhancement control unit 606. The enhanced decoded base stream is then outputted from the decoder 600.

The above-described embodiments of the invention optimize picture sharpness or quality by using a control unit to control enhancement control parameters in such a manner that the difference between picture content parameters from a reference signal and an enhanced signal becomes as low as possible.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

1. An encoder for encoding an input video bitstream to produce an encoded output bitstream, comprising: an enhancement unit (220) for enhancing a base stream based on enhancement control parameters; a first picture content parameter unit (222) for extracting at least one picture content parameter from the enhanced base stream; a second picture content parameter unit (224) for extracting at least one picture content parameter from the input video bitstream; comparison means (226, 228) for comparing the enhanced picture content parameters with the input picture content parameters; a control unit (231) for receiving an output from the comparison means and for calculating said enhancement control parameters which will minimize the difference between the input picture content parameters and the enhanced picture content parameters; means (240) for incorporating the calculated control parameters into the encoded output bitstream.
2. The encoder according to claim 1, wherein the encoder is a layered encoder with a base layer and at least one enhancement layer.
3. The encoder according to claim 2, wherein the layered encoder is a spatial layered encoder where the base layer is of a relatively low resolution.
4. The encoder according to claim 3, further comprising: means (231) for muting the input of the enhancement encoder when the difference between the input picture content parameters and the enhancement picture content parameters meet a predetermined criteria.
5. The encoder according to claim 1, wherein the difference between selected picture content parameters is multiplied by a predetermined value prior to being inputted into the control unit.
6. The encoder according to claim 5, further comprising: a summation means (230) for summing together the outputs of the comparison means which have been multiplied by the predetermined values.
7. The encoder according to claim 1, wherein the picture content parameters are from the group comprising difference between maximum and minimum value of a group of pixels, edge steepness value at center of edges, DCT Coefficient High Frequency energy level curves.
8. An encoder for encoding an input video bitstream, comprising: a downsampling unit (212) for reducing the resolution of the input video bitstream; a base encoder (214) for encoding the lower resolution base stream; an upscaling unit (216) for decoding and increasing the resolution of the base stream to produce an upscaled base bitstream; an enhancement unit (220) for enhancing the base stream based on enhancement control parameters; a first picture content parameter unit (222) for extracting at least one picture content parameter from the enhanced base stream; a second picture content parameter unit (224) for extracting at least one picture content parameter from the input video bitstream; a comparison means (226, 228) for comparing the enhanced picture content parameters from the input picture content parameters; a control unit (231) for receiving an output from the comparison means and for calculating said enhancement control parameters which will minimize the difference between the input picture content parameters and the enhanced picture content parameters; a subtraction unit (234) for subtracting the upscaled base bitstream from the input video bitstream to produce a residual bitstream; switching means (236) for selectively allowing only portions of the residual bitstream to be sent to an enhancement encoder based upon a control signal from the control unit; an enhancement encoder (240) for incorporating the portions of the residual bitstream which pass through the switching means with said enhancement control parameters to form the encoded residual bitstream.
9. The encoder according to claim 8, wherein said switching means is a multiplier having a value between 0 and 1, wherein a value of 0 means the switching means is open and a value of 1 means the switching means is closed.
10. The encoder according to claim 8, wherein the comparison between selected picture content parameters is multiplied by a predetermined value prior to being inputted into the control unit.
11. The layered encoder according to claim 10, further comprising: a summation means (230) for summing together the outputs of the comparison means which have been multiplied by the predetermined values.
12. The encoder according to claim 8, wherein the picture content parameters are from the group comprising difference between maximum and minimum value of a group of pixels, edge steepness value at center of edges, DCT Coefficient High Frequency energy level curves.
13. A method for encoding an input video bitstream in an encoder to produce an encoded output stream, comprising the steps of: enhancing a base stream based on enhancement control parameters; extracting at least one picture content parameter from the enhanced base stream; extracting at least one picture content parameter from the input video bitstream; comparing the enhanced picture content parameters from the input picture content parameters; receiving an output from the comparison step and calculating said enhancement control parameters which will minimize the difference between the input picture content parameters and the enhanced picture content parameters; incorporating the calculated control parameters into the encoded output bitstream.
14. A decoder for decoding an input stream with incorporated enhancement control parameters, comprising: a decoder (602) for decoding the input signal and separating the enhancement control parameters from the decoded signal; an enhancement unit (604, 606) for enhancing the decoded signal based on said enhancement control parameters.
15. A decoder for decoding compressed video information, comprising: a base stream decoder (402) for decoding a received base stream; an upconverting unit (406) for increasing the resolution of the decoded base stream; an enhancement stream decoder (404) for decoding a received enhancement stream and for separating imbedded enhancement control parameters from the enhancement stream; a first addition unit (410) for combining the upconverted decoded base stream and the decoded enhancement stream; enhancement means (408, 412) for enhancing the decoded base stream using said enhancement control parameters; and switch means (414) for selecting to output either the combined streams from the addition unit or the enhanced base stream.
16. The decoder according to claim 15, further comprising instead of said switch means: a first multiplication unit (502) for multiplying the output of the first addition unit by a first predetermined value; a second multiplication unit (504) for multiplying the enhancement base stream by a second predetermined value; and a second addition means (506) for adding outputs from the first and second multiplication units to form an output stream.